Abstract

Surprisingly, many researchers and students assume that the univariate normal distribution
has a single ubiquitous "bell" shape, perhaps because most books portray only the "standard
normal" or normal z-score distribution. This paper shows that the normal curve can have
infinitely many appearances.

There Are Infinitely Many Normal Distributions:
Not All Normal Distributions Are Standard Normal



Many univariate statistical methods assume that dependent variable data have a 
univariate normal distribution, and some statistical methods assume that the error scores are 
normally distributed (Thompson, 1992). Almost always books portray only a single case of 
the infinitely many univariate normal distributions: the "standard normal" or "normal
z-score" distribution, thus implicitly teaching the misconception that the normal distribution
"bell" takes a single shape. In reality, the normal curve can take infinitely many forms, each
having skewness and kurtosis coefficients of zero (Bump, 1991; Burdenski, 2000; Henson,
1999), but differing in appearance (e.g., apparent width and apparent height).

A thorough understanding of normal distributions is necessary because of their
frequent occurrence in nature as well as in testing, particularly when large samples are
considered. Examples of data which yield normal distributions include the height and weight
of adult human beings, IQ scores, chances of throwing a certain total score when throwing 2
or more dice, leaf length of mesquite trees, and standardized test scores. Combining the
concepts of standard scores and normal distributions gives us a valuable tool for analyzing 
distributions of many kinds of variables (Hinkle, Wiersma, and Jurs, 1998). 

A normal distribution is theoretical, symmetric, bell-shaped, unimodal, and fits the
equation

$$Y = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(X-\mu)^{2}/(2\sigma^{2})}$$

where Y is the height of the ordinate for any given value of X in the distribution, π is
approximately 3.1416, the ratio of the circumference of a circle to its diameter, and e is
approximately 2.7183, the base of the natural logarithm system. The mean of the distribution
of scores is represented by μ and the standard deviation of the score distribution is σ. This
equation was developed by the French mathematician Abraham de Moivre (1667-1754), who
based it on his observations of games of chance, using it to predict the probability of
different outcomes (Hinkle, Wiersma, & Jurs, 1998).

Spatz (2001) reports that about the beginning of the 19th century, Carl Friedrich
Gauss determined the mathematics of the curve, using it as a way to describe random error 
in astronomical observations. This “Gaussian” curve was recognized as an accurate 
portrayal of the results of random variation, and it began to be called the law of error. An 
early and influential advocate of the normal curve, Belgian Adolphe Quetelet demonstrated 
that many empirical biological and social measurements correspond with this theoretical 
curve. Florence Nightingale, who pioneered the use of statistics to improve health care, 
called Quetelet “the founder of ‘the most important science in the whole earth’” (Spatz, 
2001, p. 119). In his attempt to standardize statistical terminology at the end of the 19th
century, Karl Pearson began to call it the normal distribution, probably because of the large 
number of types of measurements which fall into this pattern. 

Figure 1 The Theoretical Normal Distribution (Lowery, 1999)

Normal distributions (see Fig. 1) are continuous and asymptotic to (never touch) the
X axis. The area under the curve is equal to one, and the mean, median, and mode are equal
to each other and correspond to the peak of the curve. Convention and convenience dictate
that the curve is drawn from -3σ to +3σ from the mean. The steepest points of the curve,
called inflection points, are located at exactly -1σ and +1σ, with approximately 68% of the
scores lying between these points. Approximately 95% of the scores lie between -2σ and
+2σ, and over 99% lie between -3σ and +3σ. The normal bell-shaped curve that is most
familiar is in reality the “standard normal curve representing a standardized distribution of
z-scores and is not the only normal curve” (Henson, 1999, p. 196). Hinkle et al. (1998)
point out that because no specific mean or standard deviation is specified in the formula, 
every possible mean and standard deviation combination has its own individual normal 
distribution, resulting in what they call a “family of distributions” (p. 91). Because there is 
an infinite number of possible combinations of means and standard deviations, the number 
of possible normal distributions is also infinite. This symmetrical curve represents a 
situation where there are many scores near the center of the distribution and fewer and fewer 
scores as you move away from the center. 
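
The "family of distributions" idea can be illustrated numerically. The following is a minimal sketch using Python's standard `statistics.NormalDist`; the three mean and standard deviation pairs are hypothetical values chosen only for illustration, and the helper `area_within` is not from the sources cited above.

```python
from statistics import NormalDist

# Hypothetical (mean, standard deviation) pairs, chosen only for illustration.
params = [(0.0, 1.0), (2.0, 0.5), (100.0, 15.0)]


def area_within(dist, k):
    """Proportion of the area lying within k standard deviations of the mean."""
    return dist.cdf(dist.mean + k * dist.stdev) - dist.cdf(dist.mean - k * dist.stdev)


for mu, sigma in params:
    dist = NormalDist(mu, sigma)
    areas = [round(area_within(dist, k), 4) for k in (1, 2, 3)]
    peak = dist.pdf(mu)  # height of the curve at the mean
    print(f"mean={mu}, sd={sigma}: peak={peak:.4f}, areas within 1, 2, 3 SD={areas}")
```

Every pair yields the same three areas (roughly 68%, 95%, and over 99%), while the peak height differs from curve to curve. The curves therefore look different when drawn on a common scale, yet each is normal.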

According to Bump (1991), normality of a curve is evaluated by using skewness and 
kurtosis in addition to mean and standard deviation. Mean, standard deviation, skewness, 
and kurtosis taken together are called the first four moments of a normal distribution. In the 
physical sciences, moment is a term describing the measurement of the tendency of a force 
to cause rotation around a point or axis of an object. When the forces pressing down on 
either side of the axis of the object, calculated for each side by the mass times the distance
from the axis, are equal, the object is in a balanced state. “In statistics, the term ‘moment’
denotes class frequencies that are analogous to the forces exerted in the previous example”
(Bump, 1991, p. 3). In a histogram of a normally distributed data set, “the ‘moment’
contribution of each column is measured by the product of the class frequency (f) and the
corresponding deviation (x) from the origin. The sum of the fx products, divided by the total
frequencies, gives a net measure called the first moment or the mean” (p. 3).
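
The calculation described in the quotation can be sketched as follows. The frequency table is hypothetical, and the helper `central_moment` (moments taken about the mean rather than the origin) is an illustrative assumption, not part of Bump's (1991) presentation.

```python
# Hypothetical frequency table: score value (x) -> class frequency (f).
freq = {1: 2, 2: 5, 3: 8, 4: 5, 5: 2}

n = sum(freq.values())  # total of the class frequencies

# First moment about the origin: sum of the f*x products divided by n.
mean = sum(f * x for x, f in freq.items()) / n


def central_moment(order):
    """Moment of the given order about the mean of the frequency table."""
    return sum(f * (x - mean) ** order for x, f in freq.items()) / n


variance = central_moment(2)      # second moment about the mean
third_moment = central_moment(3)  # zero here because the table is symmetric
print(mean, round(variance, 4), third_moment)
```

Higher-order moments about the mean, once standardized, are the basis of the skewness and kurtosis coefficients.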

Figure 2 Distributions of Exam Scores in Six Sections of a Statistics Course (Lowery, 1999)



All normal distributions are symmetrical; however, not all symmetrical distributions
are normal. Examples of non-normal symmetrical distributions include bimodal
distributions, as in Section F in Fig. 2, rectangular (uniform) distributions, and kurtic
distributions, as in Sections D and E. Skewed distributions, as in Sections A and B, are
neither symmetrical nor normal. Kurtosis refers to
the degree of peakedness of the distribution, and distributions may be either leptokurtic
(extremely peaked, with most scores near the center), as in Fig. 2, Sec. E, or platykurtic
(flattened, with a more uniform distribution of scores), as in Fig. 2, Sec. D. A mesokurtic
distribution, represented by Section C, is more moderate than either the platykurtic or
leptokurtic distribution. Kurtosis can be calculated using the formula



Coefficient of Kurtosis = 
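
One common moment-based definition can be sketched in code. It is assumed here because it equals zero for a normal curve, as stated earlier: the mean of the fourth powers of the z scores, minus 3. The scores are hypothetical, and statistical packages often apply small-sample corrections, so their reported values may differ slightly from this sketch.

```python
from statistics import fmean, pstdev


def excess_kurtosis(scores):
    """Mean fourth power of the z scores minus 3 (no small-sample correction)."""
    m, sd = fmean(scores), pstdev(scores)
    return fmean([((x - m) / sd) ** 4 for x in scores]) - 3.0


# Hypothetical score sets, for illustration only.
flat = list(range(1, 11))                   # evenly spread scores (platykurtic)
peaked = [1, 5, 5, 5, 5, 5, 5, 5, 5, 9]     # scores piled at the center (leptokurtic)

print(round(excess_kurtosis(flat), 3), round(excess_kurtosis(peaked), 3))
```

Under this definition a negative coefficient indicates a platykurtic distribution and a positive coefficient a leptokurtic one.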

Distributions which have a large number of scores near one end or the other of the
scale, with progressively fewer scores toward the opposite end, are said to be skewed and
are also not normal distributions (Hinkle et al., 1998). Section B in Fig. 2 is said to be
positively skewed while Section A is negatively skewed. Skewness can be calculated by the formula



Coefficient of Skewness = 
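
As a sketch under a stated assumption, skewness can be computed as the mean of the cubed z scores, a common moment-based definition that equals zero for any symmetrical distribution. The scores below are hypothetical, and packages that apply small-sample corrections will report slightly different values.

```python
from statistics import fmean, pstdev


def skewness(scores):
    """Mean cubed z score (no small-sample correction)."""
    m, sd = fmean(scores), pstdev(scores)
    return fmean([((x - m) / sd) ** 3 for x in scores])


# Hypothetical score sets, for illustration only.
positively_skewed = [1, 1, 1, 2, 2, 3, 10]                  # long tail toward high scores
negatively_skewed = [11 - x for x in positively_skewed]     # mirror image of the above
symmetric = [1, 2, 3, 4, 5]

for label, data in [("positive", positively_skewed),
                    ("negative", negatively_skewed),
                    ("symmetric", symmetric)]:
    print(label, round(skewness(data), 3))
```

The sign of the coefficient matches the direction of the longer tail, and mirroring a distribution reverses the sign without changing the magnitude.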

Nonstandard normal distributions can be transformed into standard normal
distributions (or normalized). This is particularly useful when interpreting standardized test
results, such as those from achievement tests, IQ tests, and College Entrance Examinations.
Transformed scores help to overcome some of the difficulties and disadvantages involved in
reporting test results as z scores, including the negative connotations of receiving a negative
score as perceived by students and parents, the possibility of interpreting a z score of zero as
having answered no questions correctly, and the implication by the usual 2 decimal places of
a z score of a degree of precision possibly not possessed by the raw score. In order to
transform a distribution, the raw scores are first converted to z scores (standard scores) and
then changed into a distribution with a predetermined mean and standard deviation by the
formula



$$X' = (s')(z) + \bar{X}'$$

where X' is equal to the score (new or transformed) of an individual, s' is equal to the
predetermined standard deviation of the distribution, z is the individual’s standard score, and
X̄' is the predetermined mean of the distribution (Hinkle et al., 1998).
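
The two-step transformation can be sketched as follows. The raw scores are hypothetical, the function name `transform_scores` is illustrative, and the target mean of 50 and standard deviation of 10 (the familiar T-score scale) are chosen here only as an example of predetermined values.

```python
from statistics import fmean, pstdev


def transform_scores(raw, new_mean, new_sd):
    """Convert raw scores to z scores, then to X' = (s')(z) + new mean."""
    m = fmean(raw)
    sd = pstdev(raw)  # population SD; an assumption, a sample SD could be used instead
    return [new_sd * ((x - m) / sd) + new_mean for x in raw]


raw_scores = [12, 15, 18, 21, 24]  # hypothetical raw test scores
t_scores = transform_scores(raw_scores, new_mean=50, new_sd=10)

print([round(t, 1) for t in t_scores])
print(round(fmean(t_scores), 4), round(pstdev(t_scores), 4))
```

The transformed scores have the predetermined mean and standard deviation, contain no negative values for typical cases, and keep each individual in the same relative position as the original z score.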

The heuristic illustration which follows presents a univariate data set (Thompson, 
2003) with 100 cases that have a univariate normal distribution. Then various linear
transformations (additive and multiplicative constants) are invoked to show that the
distribution can take many forms. For each
transformed distribution, summary statistics (i.e., the mean and moments about the mean, 
including standard deviation, skewness, and kurtosis) are computed, along with graphical 
overlays of the normal curve on histograms for each data set. 








Figure 3 Histogram of the Original Data Set with Overlay of Normal Curve 

The parameters of the sample data set included a mean of 0.0000 and a median of 
0.000. The variance was 0.252, the standard deviation was 0.50247 and the scores had a 
range of 2.60, with a minimum of -1.30 and a maximum of 1.30. Skewness was 0.000, and
kurtosis was -0.090. Figure 3 shows a histogram of the original data set overlaid by the
normal curve, and reveals that the data fit the normal curve very well.

Figure 4 Histogram of the Additive Data Set with Overlay of the Normal Curve

The next step was to apply an additive constant of 2 to each score in the original
data set and to determine its parameters. The new mean was 2.0000 and the new median
was also 2.0000. The variance was 0.252, the standard deviation was 0.50247, and the 
scores had a range of 2.60, with a minimum of 0.70 and a maximum of 3.30. Skewness 
was 0.000 and kurtosis was -0.090. It can be seen, therefore, that the use of an additive 
constant changes the mean, median, and minimum and maximum scores. However, it 
does not affect the variance, standard deviation, range, skewness, or kurtosis. From the 
histogram in Figure 4 it can be seen that the only difference from the graph in Figure 3 is 
in the new numbers along the abscissa, which effectively move the new histogram to the 
right along the abscissa, but do not affect its size or shape. 
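
The effect of an additive constant can be checked with a small sketch. The scores below are a hypothetical stand-in (evenly spaced quantiles of a normal curve), not the Thompson (2003) data, and `describe` uses population formulas without the small-sample corrections a statistical package may apply.

```python
from statistics import NormalDist, fmean, median, pstdev

# Hypothetical stand-in for a 100-case data set (NOT the Thompson, 2003, data):
# evenly spaced quantiles of a normal curve with mean 0 and SD 0.5.
base = NormalDist(0.0, 0.5)
original = [base.inv_cdf((i + 0.5) / 100) for i in range(100)]
shifted = [x + 2 for x in original]  # additive constant of 2


def describe(scores):
    """Mean, median, SD, range, skewness, and excess kurtosis (population formulas)."""
    m, sd = fmean(scores), pstdev(scores)
    z = [(x - m) / sd for x in scores]
    return {
        "mean": m,
        "median": median(scores),
        "sd": sd,
        "range": max(scores) - min(scores),
        "skewness": fmean([v ** 3 for v in z]),
        "kurtosis": fmean([v ** 4 for v in z]) - 3.0,
    }


before, after = describe(original), describe(shifted)
for name in before:
    print(f"{name:9s} {before[name]:8.4f} -> {after[name]:8.4f}")
```

Only the mean and median move, each by exactly the constant; the standard deviation, range, skewness, and kurtosis are unchanged.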



Figure 5 Histogram of the Multiplicative Data Set with Overlay of the Normal Curve 

A multiplicative constant of 2 was then applied to each score in the original data 
set, resulting in a new mean and new median of 0.0000. The variance was 1.010, the 
standard deviation was 1.00494, and the range was 5.20 with a minimum of -2.60 and a
maximum of 2.60. Again, skewness was 0.000 and kurtosis was -0.090. A multiplicative
constant does not affect the skewness or kurtosis, but does affect the variance, standard
deviation, and range. The mean is affected by multiplicative constants if (and only if) the
original mean is not zero. Figure 5 shows a histogram of the results of application of the 
multiplicative constant to the original data set. However, one should notice that the scale 
of the abscissa has been changed by SPSS, making the appearance of the histogram 
exactly like those in Figures 3 and 4. If the original scale (points increasing by 0.25 
instead of 0.5) had been used, the curve would appear to be much wider in proportion to 
its height. However, it would remain a normal curve. 
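
The change in width relative to height can be made concrete with a short sketch. The scores are again a hypothetical stand-in for a 100-case data set, and the fitted curves come from `NormalDist.from_samples` in Python's standard library rather than from SPSS.

```python
from statistics import NormalDist, pstdev, pvariance

# Hypothetical stand-in for a 100-case data set (NOT the Thompson, 2003, data).
base = NormalDist(0.0, 0.5)
original = [base.inv_cdf((i + 0.5) / 100) for i in range(100)]
doubled = [2 * x for x in original]  # multiplicative constant of 2

sd_ratio = pstdev(doubled) / pstdev(original)
var_ratio = pvariance(doubled) / pvariance(original)
range_ratio = (max(doubled) - min(doubled)) / (max(original) - min(original))

# Normal curves fitted to each set of scores; compare their heights at the mean.
fit_original = NormalDist.from_samples(original)
fit_doubled = NormalDist.from_samples(doubled)
peak_ratio = fit_original.pdf(fit_original.mean) / fit_doubled.pdf(fit_doubled.mean)

print(round(sd_ratio, 4), round(var_ratio, 4), round(range_ratio, 4), round(peak_ratio, 4))
```

Doubling every score doubles the standard deviation and range, quadruples the variance, and halves the peak height of the fitted curve. On a fixed axis scale the curve therefore looks wider and flatter, although it remains normal.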

Figure 6 Histogram of Additive and Multiplicative Data Set with Overlay of the Normal Curve

Finally, an additive constant of 2 and then a multiplicative constant of 2 were 
applied to the original data set, resulting in a mean of 4.0000 and a median of 4.0000. 
The variance was 1.010, the standard deviation was 1.00494, and the range was 5.20 
with a minimum of 1.40 and a maximum of 6.60. Skewness was still 0.000 and kurtosis
was -0.090. Applying an additive constant and then a multiplicative constant resulted in
changes from the original data set in mean, median, variance, standard deviation, and
range. However, it did not affect skewness or kurtosis. As can be seen in Figure 6, the
appearance of the histogram did not change from Figure 5, although it is moved along the
abscissa in a positive direction.
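
The reported values for the combined transformation can be verified by simple arithmetic, and the stability of skewness and kurtosis follows from the fact that z scores do not change. The sketch below uses the summary values reported above; the six raw scores in the second half are hypothetical and serve only to illustrate the z-score argument.

```python
from statistics import fmean, pstdev

# Summary values reported in the text for the original data set.
reported = {"mean": 0.0, "sd": 0.50247, "min": -1.30, "max": 1.30}
ADD, MULT = 2.0, 2.0  # additive constant first, then multiplicative constant

# Location-type statistics follow the transformation itself ...
new_mean = (reported["mean"] + ADD) * MULT
new_min = (reported["min"] + ADD) * MULT
new_max = (reported["max"] + ADD) * MULT
# ... while spread is only rescaled by the multiplicative constant.
new_sd = abs(MULT) * reported["sd"]
print(round(new_mean, 4), round(new_min, 2), round(new_max, 2), round(new_sd, 5))

# z scores are unchanged (for a positive multiplier), so skewness and kurtosis,
# which are averages of powers of z, cannot change either.
scores = [-1.3, -0.4, 0.0, 0.1, 0.4, 1.2]  # hypothetical scores
transformed = [(x + ADD) * MULT for x in scores]


def z_scores(values):
    """Standard scores computed with the population standard deviation."""
    m, sd = fmean(values), pstdev(values)
    return [(v - m) / sd for v in values]


max_z_change = max(abs(a - b) for a, b in zip(z_scores(scores), z_scores(transformed)))
print(max_z_change < 1e-9)
```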

Summary 

The concept of the normal curve is extremely important in statistics. Its 
mathematical properties are useful in understanding many distributions in nature, because 
often they are normally or near-normally distributed. It can be seen that the width to 
height ratio can change an infinite number of times as the mean and standard deviation 
change, making the curve appear slightly different each time, although it is still a normal 
curve. Care must be exercised in determining whether or not a distribution exhibits 
normality. Trying to judge normality by simply looking at the curve may be deceptive. 
The final judgment must be made by checking the criteria for normality: symmetry, bell
shape, unimodality, and fit to the equation for normality.